The IPython parallel architecture consists of four components: the engine, the hub, the schedulers, and the client.
Dependencies:
To install on Ubuntu:
sudo apt-get install libzmq-dev
sudo easy_install pyzmq
To install on Ubuntu using pip instead of easy_install:
sudo pip install pyzmq
Parallel computing: Multiple processors or cores on a single machine share the work.
Distributed computing: Multiple processors or cores located on several machines share the work.
This will be our first approach to parallel/distributed computing.
I'll start the engines on a profile named "nbserver".
In [1]:
from IPython.parallel import Client
In [2]:
cluster = Client(profile="nbserver")
lb_view = cluster.load_balanced_view()
print("Profile: %s" % cluster.profile)
print("Engines: %s" % len(lb_view))
In [3]:
def f(x):
    result = 1.0
    for counter in range(100000):
        result = (result * x * 0.5)
        if result % 5 == 0:
            result -= 4
    return result
In [4]:
%%timeit -r 1 -n 1
result = []
for i in range(1000):
    result.append(f(i))
Using a load-balanced view simplifies distributing work across the engines. There are two ways to use it: calling map on the view directly, or decorating a function with the view's parallel decorator.
In [7]:
def f(x):
    result = 1.0
    for counter in range(100000):
        result = (result * x * 0.5)
        if result % 5 == 0:
            result -= 4
    return result
In [9]:
%%timeit -r 1 -n 1
result = lb_view.map(f, range(1000), block=True)
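To make the idea behind a load-balanced map concrete, here is a minimal sketch using only the Python standard library. It is not how IPython's scheduler is implemented: worker threads stand in for engines, and `balanced_map`, `n_workers`, and `square` are hypothetical names chosen for illustration. Each worker takes the next task as soon as it is free, and results come back in input order. Because of the GIL, threads illustrate the scheduling pattern here rather than a real speedup for CPU-bound work.

```python
import queue
import threading


def balanced_map(func, items, n_workers=4):
    """Apply func to every item; idle workers pull the next task from a shared queue."""
    items = list(items)
    tasks = queue.Queue()
    for index, item in enumerate(items):
        tasks.put((index, item))

    # One slot per input so results keep the input order.
    results = [None] * len(items)

    def worker():
        while True:
            try:
                index, item = tasks.get_nowait()
            except queue.Empty:
                return  # no work left for this worker
            results[index] = func(item)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def square(x):
    return x * x


squares = balanced_map(square, range(10))
print(squares)
```

A fast worker simply ends up handling more tasks than a slow one, which is the property a load-balanced view provides across engines.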
Using the parallel function decorator is by far the simplest way to implement parallel computing in IPython. It doesn't allow for much control, but it is fast and works in most cases.
load_balanced_view.parallel(self, dist='b', block=None, **flags)
Decorator for making a ParallelFunction
In [10]:
@lb_view.parallel(block=True)
def f(x):
    result = 1.0
    for counter in range(100000):
        result = (result * x * 0.5)
        if result % 5 == 0:
            result -= 4
    return result
In [15]:
result = f.map(range(1000))
print("Results Count: %s" % len(result))
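As a rough illustration of what a decorator of this kind does, the sketch below attaches a map method to a plain function. It is a stand-in under stated assumptions, not IPython's ParallelFunction: `parallel_sketch`, `max_workers`, and `halve` are hypothetical names, a local thread pool replaces the engines, and a direct call runs the function locally, which may differ from the real decorator.

```python
from concurrent.futures import ThreadPoolExecutor
import functools


def parallel_sketch(max_workers=4):
    """Hypothetical decorator factory: adds a blocking .map() to the function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)  # a direct call runs locally

        def map_(iterable):
            # Fan the calls out to the pool and collect results in input order.
            with ThreadPoolExecutor(max_workers=max_workers) as pool:
                return list(pool.map(func, iterable))

        wrapper.map = map_
        return wrapper
    return decorator


@parallel_sketch(max_workers=2)
def halve(x):
    return x * 0.5


halved = halve.map(range(5))
print("Results Count: %s" % len(halved))
```

The appeal of the pattern is that the function body stays unchanged; only the decorator line decides where and how the calls run.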
In [16]:
@lb_view.parallel(block=False)
def f(x):
    result = 1.0
    for counter in range(100000):
        result = (result * x * 0.5)
        if result % 5 == 0:
            result -= 4
    return result
In [31]:
result = f.map(range(1000))
In [34]:
result
Out[34]:
In [33]:
print("Results Count: %s" % len(result.result))
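The difference between block=True and block=False resembles the difference between waiting for a value and holding a future. The sketch below shows that pattern with concurrent.futures from the standard library. It is an analogy, not the asynchronous result API that IPython returns, and `slow_double` is a hypothetical function.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_double(x):
    time.sleep(0.05)  # pretend this is expensive
    return 2 * x


with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns immediately with Future objects (non-blocking).
    futures = [pool.submit(slow_double, i) for i in range(8)]
    # The caller is free to do other work here while tasks run.
    # result() blocks until each value is ready.
    doubled = [fut.result() for fut in futures]

print("Results Count: %s" % len(doubled))
```

In the non-blocking case the call returns a handle right away, and the actual values are only fetched when they are asked for, as with result.result above.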